Perceptual Factor Analysis for Speech Enhancement
Abstract
This paper presents a new speech enhancement approach derived from the factor analysis (FA) framework. FA is a data analysis model in which the relevant common factors are extracted from observations: a factor loading matrix is estimated, and a model error term is introduced for each observation. Interestingly, FA can be viewed as a subspace approach that properly represents noisy speech. It partitions the space of noisy speech into a principal subspace containing clean speech and a complementary (minor) subspace containing the residual speech and noise. We show that FA is a more general data model than the signal subspace approach. To perform FA speech enhancement, we present a perceptual optimization procedure that minimizes the signal distortion subject to keeping the energies of the residual speech and noise below a specified level. Importantly, we present a hypothesis testing approach to optimally perform the subspace decomposition. In the experiments, we implement perceptual FA speech enhancement on the Aurora2 corpus. We find that the proposed approach achieves desirable speech recognition rates, especially when the signal-to-noise ratio is lower than 5 dB.
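The partition into a principal and a minor subspace can be illustrated with a minimal sketch. This is not the paper's perceptual FA procedure: it is the simpler eigen-decomposition form of the signal subspace approach that the abstract compares against, assuming white noise of known variance. The function name `subspace_enhance`, the parameters `noise_var` and `mu`, and the simple threshold used to pick the principal subspace (in place of the paper's hypothesis test) are all assumptions made for illustration.

```python
import numpy as np


def subspace_enhance(noisy_frames, noise_var, mu=1.0):
    """Attenuate noisy vectors direction by direction in the eigenbasis.

    noisy_frames: array of shape (n_frames, dim), one noisy vector per row.
    noise_var:    assumed white-noise variance (hypothetical, taken as known).
    mu:           hypothetical trade-off between signal distortion and
                  residual noise; larger values suppress more noise.
    """
    # Sample covariance of the noisy vectors (zero mean assumed).
    cov = noisy_frames.T @ noisy_frames / noisy_frames.shape[0]
    eigvals, eigvecs = np.linalg.eigh(cov)

    # Under white noise, the clean-speech eigenvalue is the noisy one
    # minus the noise variance; directions left at zero form the minor subspace.
    clean_vals = np.maximum(eigvals - noise_var, 0.0)
    principal = clean_vals > 0.0

    # Wiener-like gain in the principal subspace, zero gain elsewhere.
    gains = np.zeros_like(eigvals)
    gains[principal] = clean_vals[principal] / (clean_vals[principal] + mu * noise_var)

    # Linear estimator H = U diag(gains) U^T, applied to every frame.
    estimator = (eigvecs * gains) @ eigvecs.T
    return noisy_frames @ estimator.T, int(principal.sum())


# Synthetic check: a rank-2 "speech" signal in 8 dimensions plus white noise.
rng = np.random.default_rng(0)
basis, _ = np.linalg.qr(rng.standard_normal((8, 2)))
clean = 3.0 * rng.standard_normal((5000, 2)) @ basis.T
noisy = clean + np.sqrt(0.1) * rng.standard_normal(clean.shape)

enhanced, n_principal = subspace_enhance(noisy, noise_var=0.1)
print("noisy MSE:   ", np.mean((noisy - clean) ** 2))
print("enhanced MSE:", np.mean((enhanced - clean) ** 2))
```

Because sample eigenvalues fluctuate around the true noise variance, the simple threshold typically keeps more directions than the true signal rank, which is one motivation for a more principled selection rule such as the hypothesis test mentioned in the abstract.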
Similar resources
Performance analysis of various single channel speech enhancement algorithms for automatic speech recognition
This paper analyzes the performance of various single channel speech enhancement systems when they are applied to automatic speech recognition (ASR) systems as a preprocessor. Until now, research on speech enhancement algorithms has focused on improving the perceptual quality of the speech signal. However, it has not been verified yet whether the improvements of the perceptual quality also in...
A New Speech Enhancement Technique to Reduce Residual Noise Using Perceptual Constrained Spectral Weighted Factors
This paper deals with the residual musical noise that results from perceptual speech enhancement algorithms, especially those using the Wiener filtering approach. Although perceptual speech enhancement techniques perform better than non-perceptual techniques, most of them still produce a troublesome residual musical noise. This is because only noise above the noise masking threshold (NMT) is filtered ...
L2 Learners’ Lexical Inferencing: Perceptual Learning Style Preferences, Strategy Use, Density of Text, and Parts of Speech as Possible Predictors
This study was intended first to categorize the L2 learners in terms of their learning style preferences and second to investigate if their learning preferences are related to lexical inferencing. Moreover, strategies used for lexical inferencing and text related issues of text density and parts of speech were studied to determine their moderating effects and the best predictors of lexical infe...
The Application of Nonlinear Spectral Subtraction Method on Millimeter Wave Conducted Speech Enhancement
A nonlinear multiband spectral subtraction method is investigated in this study to reduce the colored electronic noise in millimeter wave (MMW) radar conducted speech. Because the oversubtraction factor of each Bark frequency band can be adaptively adjusted, the nonuniform effects of colored noise in the spectrum of the MMW radar speech can be taken into account in the enhancement process. Both t...
From Maskee to Audible Noise in Perceptual Speech Enhancement
A new analysis of perceptual speech enhancement is presented. It focuses on the fact that if only noise above the masking threshold is filtered, then noise below the masking threshold, but above the absolute threshold of hearing, can become audible after the masker filtering. This particular drawback of some perceptual filters, hereafter called the maskee-to-audible-noise (MAN) phenomenon, favo...